∆BLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets

Authors

Michel Galley

Chris Brockett

Alessandro Sordoni

Yangfeng Ji

Michael Auli

Chris Quirk

Margaret Mitchell

Jianfeng Gao

Bill Dolan

Abstract

We introduce Discriminative BLEU (∆BLEU), a novel metric for intrinsic evaluation of generated text in tasks that admit a diverse range of possible outputs. Reference strings are scored for quality by human raters on a scale of [−1, +1] to weight multi-reference BLEU. In tasks involving generation of conversational responses, ∆BLEU correlates reasonably with human judgments and outperforms sentence-level and IBM BLEU in terms of both Spearman’s ρ and Kendall’s τ.
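The abstract states only that human ratings in [−1, +1] are used to weight multi-reference BLEU; it does not give the formula. As a rough illustration of that idea, the sketch below computes a rating-weighted clipped n-gram precision for a single hypothesis. The function names, the choice to take the best weighted match per n-gram, and the normalization by the plain hypothesis n-gram count are all assumptions made for this example, not the authors' definition, and the brevity penalty and the combination across n-gram orders are left out.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_ngram_precision(hypothesis, references, n):
    """Illustrative rating-weighted clipped precision (hypothetical helper).

    hypothesis: list of tokens.
    references: list of (tokens, weight) pairs, weight in [-1, +1].
    Assumption: each hypothesis n-gram is credited with its best weighted
    clipped match over the references that contain it, and the total is
    divided by the number of hypothesis n-grams.
    """
    hyp_counts = ngrams(hypothesis, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    ref_counts = [(ngrams(tokens, n), weight) for tokens, weight in references]
    score = 0.0
    for gram, count in hyp_counts.items():
        # Weighted clipped counts from every reference containing this n-gram.
        matches = [w * min(count, rc[gram]) for rc, w in ref_counts if gram in rc]
        if matches:
            score += max(matches)
    return score / total


hyp = "i am fine".split()
refs = [("i am good".split(), 1.0), ("fine".split(), -0.5)]
print(weighted_ngram_precision(hyp, refs, 1))
```

With every weight set to 1.0, this reduces to the usual clipped multi-reference n-gram precision. A match that is found only in a negatively rated reference lowers the score, which is the behaviour the rating scale is meant to allow.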

Similar resources

deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets

BLEU deconstructed: Designing a Better MT Evaluation Metric

BLEU is the de facto standard automatic evaluation metric in machine translation. While BLEU is undeniably useful, it has a number of limitations. Although it works well for large documents and multiple references, it is unreliable at the sentence or sub-sentence levels, and with a single reference. In this paper, we propose new variants of BLEU which address these limitations, resulting in a m...

Concept-to-text Generation via Discriminative Reranking

This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem. We define a probabilistic context-free grammar that describes the structure of the inp...

Syntactic SMT Using a Discriminative Text Generation Model

We study a novel architecture for syntactic SMT. In contrast to the dominant approach in the literature, the system does not rely on translation rules, but treats translation as an unconstrained target sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Target syntax features and bilingual translation features ar...

A Framework for Discriminative Rule Selection in Hierarchical Moses

Training discriminative rule selection models is usually expensive because of the very large size of the hierarchical grammar. Previous approaches reduced the training costs either by (i) using models that are local to the source side of the rules or (ii) by heavily pruning out negative samples. Moreover, all previous evaluations were performed on small scale translation tasks, containing at mo...

Publication date: 2015
